AMUSE: Multilingual Semantic Parsing for Question Answering over Linked Data
نویسندگان
چکیده
The task of answering natural language questions over RDF data has received wIde interest in recent years, in particular in the context of the series of QALD benchmarks. The task consists of mapping a natural language question to an executable form, e.g. SPARQL, so that answers from a given KB can be extracted. So far, most systems proposed are i) monolingual and ii) rely on a set of hard-coded rules to interpret questions and map them into a SPARQL query. We present the first multilingual QALD pipeline that induces a model from training data for mapping a natural language question into logical form as probabilistic inference. In particular, our approach learns to map universal syntactic dependency representations to a language-independent logical form based on DUDES (Dependency-based Underspecified Discourse Representation Structures) that are then mapped to a SPARQL query as a deterministic second step. Our model builds on factor graphs that rely on features extracted from the dependency graph and corresponding semantic representations. We rely on approximate inference techniques, Markov Chain Monte Carlo methods in particular, as well as Sample Rank to update parameters using a ranking objective. Our focus lies on developing methods that overcome the lexical gap and present a novel combination of machine translation and word embedding approaches for this purpose. As a proof of concept for our approach, we evaluate our approach on the QALD-6 datasets for English, German & Spanish.
منابع مشابه
Answering Natural Language Questions with Intui3
Intui3 is one of the participating systems at the fourth evaluation campaign on multilingual question answering over linked data, QALD4. The system accepts as input a question formulated in natural language (in English), and uses syntactic and semantic information to construct its interpretation with respect to a given database of RDF triples (in this case DBpedia 3.9). The interpretation is ma...
متن کاملبرچسبزنی خودکار نقشهای معنایی در جملات فارسی به کمک درختهای وابستگی
Automatic identification of words with semantic roles (such as Agent, Patient, Source, etc.) in sentences and attaching correct semantic roles to them, may lead to improvement in many natural language processing tasks including information extraction, question answering, text summarization and machine translation. Semantic role labeling systems usually take advantage of syntactic parsing and th...
متن کاملEvaluation of a Layered Approach to Question Answering over Linked Data
We present a question answering system architecture which processes natural language questions in a pipeline consisting of five steps: i) question parsing and query template generation, ii) lookup in an inverted index, iii) string similarity computation, iv) lookup in a lexical database in order to find synonyms, and v) semantic similarity computation. These steps are ordered with respect to th...
متن کاملApplying Semantic Parsing to Question Answering Over Linked Data: Addressing the Lexical Gap
Question answering over linked data has emerged in the past years as an important topic of research in order to provide natural language access to a growing body of linked open data on the Web. In this paper we focus on analyzing the lexical gap that arises as a challenge for any such question answering system. The lexical gap refers to the mismatch between the vocabulary used in a user questio...
متن کاملQALD-3: Multilingual Question Answering over Linked Data
The third edition of the open challenge on Question Answering over Linked Data (QALD-3) has put a strong emphasis on multilinguality. This paper provides an overview of the first task, focusing on multilingual question answering, which attracted six teams to submit results.
متن کامل